Supervisory Control of a Database Unit
Abstract: To enhance service availability, this
paper proposes a redundancy configuration for a database unit
residing in a command and control (C2) system that supports
air operations. The results of modeling, supervisory control,
and performance analysis of the database unit are presented.
The unit is modeled as a closed Markovian queuing network.
State variable feedback is used to implement the functions of
restoration and routing once the failure of one of the database
servers in the unit has been identified. Several control policies
are evaluated in terms of the resulting mean time to unit
failure, the steady-state availability, the expected response time,
and the service overhead of the database unit.

I.  Introduction 

The recent effort to install and test monitoring tools and
to increase the level of redundancy in critical
subsystems in air operation centers [1] has provided
opportunities for vast performance improvement in their
command and control (C2 hereafter) supporting systems.
Our previous work on a controlled C2 processing unit [2]
has demonstrated that reduced response time to service
requests and shortened periods of system unavailability, as a
result of automated monitoring and control, can significantly
raise the probability of attaining the desired outcome in
an air operation. This paper shifts focus to another critical
C2 subsystem, a database unit. A simulation study [3] of a
controlled database unit has recently been performed using
Arena [4], [5]. The results indicate, however, that the
architecture shown in Fig. 1 is extremely inefficient: the
service burden rests almost entirely on the primary
server, while the secondary server, though indispensable for
the required system availability, is rarely utilized.

Fig.2  shows  an  alternative  architecture  for  which  the 
potential  improvements  in  response  time  and  in  service 
availability  are  to  be  examined.  The  partition  of  the  database 
into  multiple  sets  of  data  (to  be  called  data  classes  hereafter), 
and  the  simultaneous  access  to  multiple  servers  allow  the 
reduction  of  the  response  time  to  queries,  whereas  the 
presence  of  a  secondary  data  class  in  every  server  leads  to 
fault-tolerance and therefore higher service availability. The
performance improvement, however, cannot be achieved in a
cost-effective manner without a reconfiguration scheme,
called a supervisory control, that acts on the state information
of the database system. This effort investigates several such
schemes that differ in their control authorities. To assess the
effectiveness of these schemes quantitatively, the
model in Fig. 2 (and that in Fig. 1) is interpreted as
a queuing network [6] with specific sets of operating
policies and structural parameters. The control authorities
considered include the ability to restore the lost data and/or
the ability to route queries. In order to obtain an analytic
model of manageable size for scrutinizing the effects of
supervisory control, the archiving process is ignored, and the
queuing network is of the closed type [7]. A simulation
study without these simplifications is currently being
conducted.

The  paper  is  organized  as  follows.  Section  II  of  the  paper 
models  the  database  system  in  Fig.2  as  a  Markov  chain  [8] 
with  supervisory  control.  Section  III  evaluates  a  set  of 
performance  measures  under  several  supervisory  control 
policies. Section IV concludes the paper. Section V
acknowledges the contributions from our colleagues. Details
of the database model are given in the Appendix.


[Figures omitted; recoverable figure label: Data Restoration]

Fig. 1 Redundant database unit

Fig. 2 Partitioned database unit


II.  Modeling  and  control 

A.  Modeling 

The database unit in Fig. 2 contains three servers in
parallel to answer three classes (A, B, C) of queries for
which relevant information can be found in the partitioned
sets A, B, C of the database, respectively. Server $S_{AB}$
contains database class A as the primary class and database
class B as the secondary class. Server $S_{BC}$ contains database
class B as the primary class and database class C as the
secondary class. Server $S_{CA}$ contains database class C as the
primary class and database class A as the secondary class.
The failure of a server implies the loss of both classes of data
within the server. A system level failure is declared when
two servers fail, in which case one class of data is said to be
lost. The queues preceding servers $S_{AB}$, $S_{BC}$, and $S_{CA}$ are
named $Q_{AB}$, $Q_{BC}$, and $Q_{CA}$, respectively. All queues are of
sufficient capacity. Service is provided on a first-come,
first-served (FCFS) basis at each server.

The  three  delay  elements  imply  that  there  are  always  three 
customers  present  in  the  unit  at  any  given  time.  A  new  query 
is  generated  at  a  delay  element  upon  the  completion  of  the 
service  to  a  query  at  one  of  the  servers.  The  delay  elements 
are  intended  to  be  also  reflective  of  the  response  time  to  the 
querying  customers  by  other  service  nodes  in  the  C2 
supporting system, which are not explicitly modeled. Any
new query is assumed to be equally likely to seek database
class A, B, or C. Therefore the routing probabilities $p_{AB}$, $p_{BC}$,
and $p_{CA}$ are assigned the same value, 1/3, under the normal
operating condition.

A queuing network model is used for the database because it
readily accommodates control actions and because we
intend to capture their effects on the system performance.
The model is built in this study on the premise that event
life distributions have been established for the process of
query generation ($\exp(\lambda)$, with distribution function
$1 - e^{-\lambda t}$), the process of service
completion ($\exp(\mu)$), the process of server failure ($\exp(\nu)$),
the process of data restoration ($\exp(\gamma)$), and the process of
unit overhaul ($\exp(\omega)$) when the failed database unit is
repaired. All such processes are independent. Standard
statistical methods that involve data collection, parameter
estimation, and goodness of fit tests [9] exist for identifying
event life distributions. Since all event lives are assumed to
be exponentially distributed, the database unit can be
conveniently modeled as a Markov chain specified by a state
space $\mathcal{X}$, an initial state probability mass function (pmf)
$\pi(0)$, and a set of state transition rates A [7], [8]. The reader
uninterested in the details of model building can advance to
the paragraph right above Equation (1).

1) State space $\mathcal{X}$

A state name is coded with a 6-digit number indicative
of all queue lengths and server states in the unit. With some
abuse of notation, a valid state representation is given by
$x = Q_{AB}Q_{BC}Q_{CA}S_{AB}S_{BC}S_{CA}$, where the queue lengths
$Q_{AB}, Q_{BC}, Q_{CA} \in \{0, 1, 2, 3\}$ with total length
$L = Q_{AB} + Q_{BC} + Q_{CA} \le 3$, and the server states
$S_{AB}, S_{BC}, S_{CA} \in \{0, 1, 2\}$. Server state "2" means that data
are lost in both the primary and the secondary classes in a
server, "1" means that the data in the primary class have been
restored and the data in the secondary class have not been restored, and
"0" means that the data in both the primary class and the
secondary class in a server are intact. A server is said to be in
the down state if it is either at state "1" or at state "2". For
example, state 110020 indicates that server $S_{AB}$ is up with one
customer in its queue, server $S_{BC}$ is down with both classes of
data gone and one customer in its queue, and server $S_{CA}$ is up
and idle. Note that the queue length includes the customer being
served. There are 540 valid states in the system (20 queue-length
arrangements times 27 server-state combinations). The total
number of states is reduced to 147 when the states of system
level failures are aggregated: 140 states with at most one server
down, plus seven aggregated failure states. The symmetry of the system
permits the arrangement of customers in the queues at the
time of system level failure to be captured in one of seven
states, allowing the system to return to an equivalent state
upon completion of the system overhaul. A set of alternative
state names is assigned from $\mathcal{X} = \{1, 2, \ldots, 147\}$, with
000000 mapped to $x = 1$ and the aggregated system failure
states mapped to $x \in \{141, 142, 143, 144, 145, 146, 147\}$.
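The state counts quoted above can be checked by brute-force enumeration. The following is a minimal sketch, assuming that a system level failure means two or more servers in a down state ("1" or "2"); the function names are illustrative and not part of the original model description.

```python
from itertools import product

N_CUSTOMERS = 3  # closed network: three delay elements, three customers


def valid_states():
    """Yield (q_ab, q_bc, q_ca, s_ab, s_bc, s_ca) for every valid state."""
    for queues in product(range(N_CUSTOMERS + 1), repeat=3):
        if sum(queues) > N_CUSTOMERS:
            continue  # total queue length L cannot exceed 3
        for servers in product((0, 1, 2), repeat=3):
            yield queues + servers


def is_system_failure(state):
    """Assumption: system level failure = two or more servers down (1 or 2)."""
    servers = state[3:]
    return sum(1 for s in servers if s != 0) >= 2


def code(state):
    """Six-digit state name, e.g. (1, 1, 0, 0, 2, 0) -> '110020'."""
    return "".join(str(d) for d in state)


all_states = list(valid_states())
operational = [s for s in all_states if not is_system_failure(s)]

print(len(all_states))       # 540 valid states
print(len(operational))      # 140 states with at most one server down
print(len(operational) + 7)  # 147 once failures are aggregated into 7 states
print(code((1, 1, 0, 0, 2, 0)))
```

The enumeration reproduces only the counts; the mapping of each six-digit code to a state index in {1, ..., 147} is defined by the transition table in the Appendix and is not reconstructed here.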

2) Initial state pmf $\{\pi_x(0),\ x = 1, 2, \ldots, 147\}$

It is assumed that the database unit starts operation from
state $x = 1$, i.e., the initial state probability is given by the vector
$\pi(0) = [1\ 0\ \cdots\ 0]$. When overhaul is considered at the
occurrence of a system level failure, the system returns to a
state with an equivalent arrangement of customers in the
queues once the database unit is renewed [8] and ready for
operation again.

3) Set of state transition rates A

A transition rate table containing all transition rates is
created following a procedure similar to that described in
[10], but with a more compact representation. The state
transition table is given in the Appendix. The list of current
states occupies the first column of the table. In the row
corresponding to each state, the set of all feasible next states
is listed, with each next state followed by the rate at which
the next state is reached. Events that trigger the transitions
and the corresponding transition rates are given as follows.
A newly generated query enters one of the servers with rate
$p_{u_2}(3 - L)\lambda$, where $3 - L$ is the number of customers at
the delay elements and $p_{u_2}$ is a routing probability controlled by
the control variable $u_2$. A query is answered at a
server with rate $\mu$. A complete data loss occurs at a server
with rate $\nu$. Data in the primary data class of a server are
restored with rate $\gamma_p u_1$, where $u_1$ authorizes whether to
restore the lost data. Data in the secondary data class of a
server are restored with rate $\gamma_s u_1$. Finally, the failed database
unit is renewed with rate $\omega u_3$, where $u_3$ decides whether to
repair the failed system. All rates are relative, for their net
effects depend on the time unit specified.

Let $X(t) \in \mathcal{X}$ denote the random state variable at time t. The
set of state transition functions

$P_{ij}(t) = P[X(t) = j \mid X(0) = i], \quad i, j = 1, 2, \ldots, 147$  (1)

for the continuous-time Markov chain can be solved from
the forward Chapman-Kolmogorov equation [7]

$\dot{P}(t) = P(t)Q, \quad P(0) = I, \quad P(t) = [P_{ij}(t)]$,  (2)

where Q is called an infinitesimal generator or a rate
transition matrix whose (i, j)th entry, for $i \ne j$, is given by the rate
associated with the transition from current state i to next
state j in the rate transition table, and whose diagonal entries
make each row sum to zero. The state probability mass
function at time t,

$\pi(t) = [\pi_1(t)\ \pi_2(t)\ \cdots\ \pi_{147}(t)], \quad t \ge 0$,  (3)

is computed by

$\pi(t) = \pi(0)P(t)$.  (4)

At  this  point  a  Markov  model  for  the  database  unit  of 
Fig.2  has  been  established.  The  state  probabilities  are  the 
basis  for  evaluating  the  performance  of  the  database  unit, 
which  is  conducted  in  Section  III. 
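Equations (2) and (4) can be evaluated numerically in several ways. The sketch below uses uniformization, which is one standard technique and not necessarily the method used in this study. It is applied to a hypothetical three-state reduction of the unit (all servers up, one server down, unit failed), chosen only to keep the example small; the state aggregation and the rate values plugged in are assumptions for illustration.

```python
import math

# Hypothetical 3-state reduction (NOT the 147-state model of this paper):
# state 0 = all servers up, 1 = one server down, 2 = unit failed.
NU, GAMMA, OMEGA = 0.005, 0.05, 0.01

Q = [
    [-3 * NU, 3 * NU, 0.0],              # any one of three servers fails
    [GAMMA, -(GAMMA + 2 * NU), 2 * NU],  # restoration, or a second failure
    [OMEGA, 0.0, -OMEGA],                # overhaul (u3 = 1)
]


def transient_pmf(pi0, Q, t, tol=1e-12, max_terms=100000):
    """Return pi(t) = pi(0) exp(Q t), computed by uniformization."""
    n = len(Q)
    rate = max(-Q[i][i] for i in range(n))  # uniformization rate
    # U = I + Q / rate is a stochastic matrix
    U = [[(1.0 if i == j else 0.0) + Q[i][j] / rate for j in range(n)]
         for i in range(n)]
    weight = math.exp(-rate * t)            # Poisson weight for k = 0
    v = list(pi0)                           # v holds pi(0) U^k
    result = [weight * x for x in v]
    total = weight
    k = 0
    while 1.0 - total > tol and k < max_terms:
        k += 1
        v = [sum(v[i] * U[i][j] for i in range(n)) for j in range(n)]
        weight *= rate * t / k
        total += weight
        result = [r + weight * x for r, x in zip(result, v)]
    return result


pi0 = [1.0, 0.0, 0.0]
pi_short = transient_pmf(pi0, Q, 10.0)
pi_long = transient_pmf(pi0, Q, 2000.0)
print(pi_short, sum(pi_short))
print(pi_long)  # close to the steady state of this toy chain
```

For the full model, the same routine would take the 147-by-147 matrix Q assembled from the transition table; for very large values of the product of the uniformization rate and t, the initial Poisson weight underflows and a scaled variant would be needed.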


B. Control policies

Our ultimate goal is to eliminate all single points of failure
and to mitigate the effects of a single server failure on the
performance of the database unit. Our approach is to base
the supervisory control actions on the state information, so that
they effectively alter the transition rates when loss of data
occurs in a single server.

Taking into consideration the symmetry of the model, the
control policy is described only for the case of a failed server
$S_{AB}$. When routing control is effective, the routing
probabilities are determined by the state of $S_{AB}$ and by
whether the lost data can be restored. Thus,

$p_{u_2} = \{p_{AB}(S_{AB}, u_1),\ p_{BC}(S_{AB}, u_1),\ p_{CA}(S_{AB}, u_1)\}, \quad p_{AB} + p_{BC} + p_{CA} = 1$.

The control policies considered for this study are summarized as follows.

$u_1 = \begin{cases} 0, & S_{AB} = 2:\ S_{BC} \text{ serves},\ S_{CA} \text{ serves (no restoration)} \\ 1, & S_{AB} = 2:\ S_{BC} \text{ serves},\ S_{CA} \text{ restores class A data} \\ 1, & S_{AB} = 1:\ S_{CA} \text{ serves},\ S_{BC} \text{ restores class B data} \end{cases}$

$u_2 = \begin{cases} 0, & S_{AB} = 2:\ p_{AB} = p_{BC} = p_{CA} = \tfrac{1}{3} \\ 1, & S_{AB} = 2:\ p_{AB}(2, u_1),\ p_{BC}(2, u_1),\ p_{CA}(2, u_1) \\ 1, & S_{AB} = 1:\ p_{AB}(1, u_1),\ p_{BC}(1, u_1),\ p_{CA}(1, u_1) \end{cases}$

Four sets of routing probabilities are shown in the
following table as examples, where $S_{BC} = 0$ and $S_{CA} = 0$ are
assumed.

Table 1 Examples of routing probabilities
(values in parentheses apply when $S_{AB} = 1$)

u_1   u_2   S_AB    p_AB        p_BC        p_CA
0     1     2       0           1/2         1/2
1     0     2 (1)   1/3 (1/3)   1/3 (1/3)   1/3 (1/3)
1     1     2 (1)   0 (1/6)     2/3 (1/6)   1/3 (2/3)
1     1     2 (1)   0 (0)       1 (0)       0 (1)

The composition of $u_1$ and $u_2$ gives rise to four different
control policies. The case of $(u_1, u_2) = (0, 0)$ corresponds to
the case of a single point of failure, and is therefore not
considered in the performance analysis. The control policies
in the other three cases are named

Policy 1: $(u_1, u_2) = (0, 1)$ when a server is down,
Policy 2: $(u_1, u_2) = (1, 0)$ when a server is down,  (7)
Policy 3: $(u_1, u_2) = (1, 1)$ when a server is down.

Note that policy 2 does not permit routing, whereas policy
1 does not permit restoring. As can be seen, policy 3 allows
variations in the routing probabilities to the intact servers. A
special consideration with the case $u_1 = 0$ is the rerouting, to
the delay elements, of the customers who had arrived at a
server before that server failed.

The presence of supervisory control in the transition rate
table is seen via $u_1$, $u_2$, $u_3$, $\bar{u}_1 = 1 - u_1$, $\bar{u}_2 = 1 - u_2$,
and $\bar{u}_3 = 1 - u_3$. The values of $u_1$, $u_2$, $u_3$ represent specific
control actions associated with data restoration, query routing,
and unit overhaul, respectively.

III. Performance analysis

A. Time to system failure

When $u_3 = 0$, the Markov chain model for the database
unit contains seven absorbing states $x \in \{141, 142, 143, 144,
145, 146, 147\}$, in which the chain remains forever once one of
them is entered. These are the states of system level failure. The
remaining 140 states are transient states. Decompose the state
probability vector

$\pi(t) = [\,\pi_T(t)\ \ \pi_A(t)\,]$, with $\pi_T(t)$ of size $1 \times 140$ and $\pi_A(t)$ of size $1 \times 7$,  (8)

where vector $\pi_T(t)$ contains the transient state probabilities,
and $\pi_A(t)$ the absorbing state probabilities. Decomposing
the rate transition matrix Q and the state transition function
matrix P(t) solved from (2) accordingly yields

$Q = \begin{bmatrix} Q_{11} & Q_{12} \\ 0 & 0 \end{bmatrix}, \quad P(t) = \begin{bmatrix} P_{11}(t) & P_{12}(t) \\ 0 & I \end{bmatrix}$.  (9)

From (2), (4), and (9), it can be determined that the
probability density of the time to system failure, or
time to absorption, resolved by absorbing state, is given by

$\dot{\pi}_A(t) = \pi_T(0) P_{11}(t) Q_{12}, \quad \pi_A(0) = 0$,  (10)

where

$\pi_T(0) = [1\ 0\ \cdots\ 0], \quad P_{11}(t) = e^{Q_{11} t}$.  (11)

The entries of $\dot{\pi}_A(t)$ sum to the density of the time to
system failure. In addition, the mean time to failure of the
database unit can be shown to be [8]

$\mathrm{MTTF} = -\pi_T(0) Q_{11}^{-1} \mathbf{1}_T, \quad \mathbf{1}_T = [1\ \cdots\ 1]'$.  (12)

Fig.3  below  shows  the  dependence  of  mean  time  to  failure 
of  the  database  unit  on  the  restoration  rate. 
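Equation (12) amounts to one linear solve with the transient block of the generator. The sketch below illustrates it on a hypothetical two-transient-state reduction (all servers up, one server down), with the failed unit as the single absorbing state. The reduction and the helper names are assumptions for illustration; the numbers it prints are not the values plotted in Fig. 3.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def mttf(pi_t0, Q11):
    """MTTF = -pi_T(0) Q11^{-1} 1, obtained by solving Q11 m = -1."""
    m = solve(Q11, [-1.0] * len(Q11))  # m[i]: mean time to absorption from i
    return sum(p * mi for p, mi in zip(pi_t0, m))


NU = 0.005  # server failure rate


def toy_q11(gamma):
    """Transient block of the hypothetical chain (0 = all up, 1 = one down)."""
    return [[-3 * NU, 3 * NU],
            [gamma, -(gamma + 2 * NU)]]


for gamma in (0.0, 0.05, 0.1):
    print(gamma, mttf([1.0, 0.0], toy_q11(gamma)))
```

In this toy chain the mean time to failure grows linearly with the restoration rate, since it equals $(\gamma + 5\nu)/(6\nu^2)$; no such closed form is claimed for the 147-state model.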


[Figure omitted. Plot parameters: $\lambda = 6$, $\mu = 12$, $\nu = 0.005$, $\omega = 0.01$]

Fig. 3 Database unit mean time to failure versus restoration rate

B. Steady-state availability

Suppose that an overhaul process starts as soon as the database
unit reaches a system level failure. Suppose further that the
unit is repaired with rate $\omega$ and that, at the completion of the
repair, the unit immediately starts to operate again. In this case
$u_3$ is set to 1 in the model, whereas it is set to 0 in the case of an
absorbing chain. The existence of a unique steady-state
distribution of the Markov chain when $u_3 = 1$ is guaranteed if
the chain is irreducible (or ergodic) [7]. Ergodicity is
satisfied under policy 2 and policy 3. Although ergodicity is
not met under policy 1 unless the few unreachable states in
this case are eliminated, a unique steady-state
distribution is nevertheless obtained in our computation. The
steady-state availability, which can be roughly thought of as
the fraction of time the database unit is up, is given by

$A_{sys} = 1 - \pi_F(\infty)$,  (13)

where $\pi_F(\infty)$ is the sum of the system level failure state
probabilities, determined by solving

$\pi(\infty)Q = 0 \quad \text{and} \quad \sum_{x=1}^{147} \pi_x(\infty) = 1$.  (14)

Fig.4  shows  the  steady-state  availability  as  a  function  of 
restoration  rate  at  a  fixed  overhaul  rate.  Fig.  6  demonstrates 
the  benefit  of  the  success  in  supervisory  control  to  steady- 
state  availability. 
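Equations (13) and (14) can be sketched as a single linear solve in which one redundant balance equation is replaced by the normalization condition. The example uses a hypothetical three-state chain (all up, one server down, unit failed and under overhaul), which is an assumed reduction for illustration and not the 147-state model behind Fig. 4.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def steady_state(Q):
    """Solve pi Q = 0 together with sum(pi) = 1."""
    n = len(Q)
    # Transpose so the unknowns form a column vector: Q^T pi^T = 0
    A = [[Q[j][i] for j in range(n)] for i in range(n)]
    b = [0.0] * n
    A[-1] = [1.0] * n  # one balance equation is redundant; normalize instead
    b[-1] = 1.0
    return solve(A, b)


def availability(Q, failure_states):
    """A_sys = 1 - (sum of steady-state failure probabilities)."""
    pi = steady_state(Q)
    return 1.0 - sum(pi[x] for x in failure_states)


# Hypothetical chain: 0 = all up, 1 = one server down, 2 = unit failed.
NU, GAMMA, OMEGA = 0.005, 0.05, 0.01
Q = [[-3 * NU, 3 * NU, 0.0],
     [GAMMA, -(GAMMA + 2 * NU), 2 * NU],
     [OMEGA, 0.0, -OMEGA]]

print(steady_state(Q))
print(availability(Q, failure_states=[2]))
```

Replacing a balance equation by the normalization row presumes a chain with a single recurrent class; as noted above for policy 1, unreachable states would have to be removed before such a solve is well posed in general.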


[Figure omitted. Plot parameters: $\lambda = 6$, $\mu = 12$, $\nu = 0.005$, $\omega = 0.01$]

Fig. 4 Steady-state availability of the database unit versus
restoration rate

C.  Response  time 

The average response time E[R] is the expectation of the
ratio of the total amount of time that all customers spend in the
upper portion of the system to the number of customers that
are serviced. A loose argument is given below to justify the
way E[R] is computed in this paper. Define the vector C,
where C(i) is the number of customers in the system at state
i. The numerator of E[R] is then $\pi(\infty)C^T$. Computing the
number of customers that are serviced requires counting the
transitions from one state to another that introduce a new
customer to the system.
Define a matrix N such that N(i, j) is equal to the number of
customers introduced into the system when the system
transitions from state i to state j. The expected number of
customers introduced per unit time through the transition from
i to j is then

$T(i, j) = N(i, j)\,\pi_i(\infty)\,Q(i, j)$.  (15)

Therefore, the average response time E[R] of the system is
taken as

$E[R] = \dfrac{\pi(\infty)C^T}{\sum_{i=1}^{147}\sum_{j=1}^{147} T(i, j)} = \dfrac{\pi(\infty)C^T}{\sum_{i=1}^{147}\sum_{j=1}^{147} N(i, j)\,\pi_i(\infty)\,Q(i, j)}$.  (16)

Fig.5a  and  Fig.6  show  the  average  response  time  as  a 
function  of  restoration  rate  with  the  overhaul  rate  fixed,  and 
a  function  of  overhaul  rate  with  the  restoration  rate  fixed, 
respectively,  for  all  three  policies.  The  routing  probabilities 
in  rows  1  through  3  in  Table  1  are  in  fact  used  for 
calculating  all  performance  measures  resulting  from  Policies 
1  through  3,  respectively.  Policy  1  enjoys  a  lower  response 
time  because  the  intact  servers  need  not  deny  customers  in 
order  to  restore  the  failed  server.  Also,  customers  present  at 
the  time  of  server  failure  in  policy  1  are  emptied  into  the 
delay  elements  and  incur  no  response  time  gains. 
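The response-time ratio defined in this section, mean number of customers in the system divided by the rate at which customers enter it, can be illustrated on a much smaller closed model. The sketch below assumes a hypothetical single-server queue fed by three delay elements, with no failures; it is not the database unit of Fig. 2, and the resulting number is not a result of this paper.

```python
LAM, MU, N_CUST = 6.0, 12.0, 3

# Hypothetical single-queue closed model: state i = customers at the server.
states = range(N_CUST + 1)
Q = [[0.0] * (N_CUST + 1) for _ in states]
for i in states:
    if i < N_CUST:
        Q[i][i + 1] = (N_CUST - i) * LAM  # a delayed customer issues a query
    if i > 0:
        Q[i][i - 1] = MU                  # a query is answered
    Q[i][i] = -sum(Q[i])                  # diagonal makes the row sum to zero

# Steady state of a birth-death chain from the detailed balance equations.
weights = [1.0]
for i in range(N_CUST):
    weights.append(weights[-1] * Q[i][i + 1] / Q[i + 1][i])
total = sum(weights)
pi = [w / total for w in weights]

C = list(states)  # C[i]: customers in the system at state i
N = [[1 if j == i + 1 else 0 for j in states] for i in states]

numerator = sum(pi[i] * C[i] for i in states)
arrival_rate = sum(N[i][j] * pi[i] * Q[i][j] for i in states for j in states)
expected_response = numerator / arrival_rate
print(expected_response)  # approximately 0.15 time units for these rates
```

The ratio is an instance of Little's law: the numerator is the mean number in the system and the denominator is the throughput, so their quotient is the mean time a customer spends in the system.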

[Figure omitted. Plot parameters: $\lambda = 6$, $\mu = 12$, $\nu = 0.005$, $\omega = 0.01$]

Fig. 5 Average query response time versus restoration rate

Fig.5b  shows  the  effect  of  applying  Policy  3*:  routing  all 
customers  to  the  intact  server  that  is  not  restoring  the  failed 
server,  an  alternative  to  Policy  3.  The  reduced  response  time 
in  policy  3*  results  from  customers  not  waiting  at  a  failed 
server.  This  policy  may  not  be  as  advantageous  in  a  system 
of  higher  traffic  intensity. 
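The routing choices behind Policies 1 through 3 and the alternative Policy 3* can be written as a lookup keyed by the state of the failed server. The sketch below transcribes the example rows of Table 1 for a failed $S_{AB}$ with the other two servers intact; the dictionary layout and the function name are illustrative assumptions.

```python
from fractions import Fraction as F

UNIFORM = (F(1, 3), F(1, 3), F(1, 3))  # normal operation, all servers intact

# Values are (p_AB, p_BC, p_CA), keyed by the state of the failed server S_AB.
# Transcribed from the example rows of Table 1 (S_BC = 0 and S_CA = 0 assumed).
ROUTING = {
    "policy1":  {2: (F(0), F(1, 2), F(1, 2))},   # no restoration, so S_AB stays at 2
    "policy2":  {2: UNIFORM, 1: UNIFORM},        # restoration without routing
    "policy3":  {2: (F(0), F(2, 3), F(1, 3)),    # S_CA restores class A
                 1: (F(1, 6), F(1, 6), F(2, 3))},  # S_BC restores class B
    "policy3*": {2: (F(0), F(1), F(0)),          # all traffic to the server
                 1: (F(0), F(0), F(1))},         # that is not restoring
}


def routing(policy, s_ab):
    """Return (p_AB, p_BC, p_CA) for the given policy and state of S_AB."""
    if s_ab == 0:
        return UNIFORM
    probs = ROUTING[policy][s_ab]
    assert sum(probs) == 1  # routing probabilities must sum to one
    return probs


for name in ROUTING:
    for s_ab in sorted(ROUTING[name], reverse=True):
        print(name, s_ab, [str(p) for p in routing(name, s_ab)])
```

By the symmetry of the model, the cases of a failed $S_{BC}$ or $S_{CA}$ would follow from a cyclic relabeling of the servers; that relabeling is not spelled out in the sketch.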


[Figure omitted. Plot parameters: $\lambda = 6$, $\mu = 12$, $\nu = 0.005$, $\gamma = 0.05$]

Fig. 6 Average query response time versus overhaul rate

D. Overhead

Overhead is a quantity introduced to reflect the ratio of the
time invested in helping the database unit survive longer
to its overall busy time. It is a measure of the cost of
supervisory control. More specifically,

$\theta = \dfrac{\Pr[S_{AB} \text{ restores or fails} \mid \text{unit is not failed}]}{\Pr[S_{AB} \text{ restores or fails or serves} \mid \text{unit is not failed}]}$.  (17)

Overhead $\theta$ is calculated for both the absorbing chain ($u_3 = 0$)
as a function of time, and the irreducible chain ($u_3 = 1$) as
a function of server failure rate. These are shown in Fig. 7
and Fig. 8. In Fig. 7, it is seen that restoration incurs a higher
overhead in the early life of the unit. As the database unit
ages, its servers become more likely to fail, and a control policy
that permits restoration becomes advantageous. There is a
reduction in overhead across all policies with an increase in
the arrival rate because of the resulting increased utilization.
In Fig. 8, for sufficiently low server failure rate, overhead is
always lower with restoration. When the server failure rate
passes some threshold, however, restoration becomes
expensive. Overhead is expected to gain more significance
as a function of time and as a function of server failure rate
when the server life distributions have an increasing failure
rate, such as in the case of the Weibull distribution.


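[Figure omitted. Panel title: Absorbing System ($\mu = 12$, $\nu = 0.005$, $\gamma = 0.05$, $\omega = 0$); panels (a) $\lambda = 6$, (b) $\lambda = 50$]

Fig. 7 Overhead versus time in the absorbing chain

[Figure omitted. Panel title: Irreducible System ($\mu = 12$, $\gamma = 0.05$, $\omega = 0.01$); panels (a) $\lambda = 6$, (b) $\lambda = 50$]

Fig. 8 Overhead versus failure rate in the irreducible chain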

IV.  Conclusions 

This paper modeled a redundant database unit in C2 to
investigate the fault-tolerance and responsiveness afforded
by a set of supervisory control policies. In all the
performance measures examined, restoration ($u_1$) is more
effective than routing ($u_2$). It is expected that when the
number of queries increases, or the traffic becomes more
intensive, the effectiveness of routing will be more apparent.


The  study  presented  in  this  paper  is  limited  by  our  ability 
to  deal  with  complex  problems  analytically.  Most  restrictive 
is  the  size  of  the  state  space.  The  closed-queuing  network 
model  shown  in  Fig.  2  presents  perhaps  the  smallest  possible 
state  space  for  which  the  investigation  on  control  policies  is 
nontrivial.  Besides  answering  queries,  the  database  unit  also 
must  be  updated  from  time  to  time.  In  that  case,  two  types  of 
service  requests  exist  and  the  state  space  must  be  expanded. 
Almost  equally  restrictive  is  the  assumption  that  times  to 
event  occurrence  are  exponentially  distributed.  Since  there  is 
only  one  parameter  in  an  exponential  distribution,  it  is  likely 
to  be  unsuitable  to  truthfully  describe  some  of  the  processes. 
Discrete  event  simulations  are  being  carried  out  where  the 
simplifying  assumptions  are  removed  to  substantiate  our 
claims  on  the  benefit  of  supervisory  control  under  more 
general  settings  in  terms  of  the  types  of  services,  the  number 
of  customers,  and  the  types  of  distributions  of  event  lives. 

Also  ongoing  is  the  extension  of  this  study  to  incorporate 
the  effect  of  decision  and  control  under  uncertainty  and  time 
delay  due  to,  for  example,  incomplete  state  information  and 
the  time  required  for  state  estimation,  respectively.  The 
results  will  be  reported  in  a  future  paper. 
Appendix

Table 2 Transitions and transition rates of the database unit
model with all rates valid for all policies, based on which matrix Q
is formed